pacman::p_load(tidyverse, sf, sfdep, tmap, mapview, plotly, knitr)Take-home Exercise 1: Geospatial Analytics for Public Good
1 Introduction
1.1 Background
Buses and the Mass Rapid Transit (MRT) system are the main modes of transport used by Singaporeans for their daily commutes. With an ever increasing population, the key challenge for public bus operators is balancing supply against demand by optimising its services, fleets, and manpower for different bus routes.
Hence, the identification and analysis of the movement patterns of public buses can provide insights on commuting in Singapore, allowing for the development of better urban management strategies by both the public and private sectors.
1.2 The Study Area
The study area is Singapore, a city-state in Southeast Asia, with a total land area of about 730 square kilometres and a total population of 5.92 million (as at June 2023).
Commuting by public bus is the most common mode of transport in Singapore, with an average daily ridership of about 3.461 million in 2022, according to the LTA. The public bus fleet is approximately 5,800, plying more than 300 routes and 5,000 bus stops.
1.3 The Analytical Questions
In this take-home exercise, I aim to answer the following analytical questions:
Is the demand for public bus transport evenly distributed geographically?
If no, are there signs of spatial clustering?
If yes, where are these clusters, and what kind of clusters are they?
Based on the analysis, what are the insights that can be derived for urban transport planning?
1.4 Objectives and Tasks
Hence, the objectives of this take-home exercise are to:
Apply Exploratory Spatial Data Analysis (ESDA) to obtain a preliminary understanding of public bus movements in Singapore; and
Apply appropriate Local Indicators of Spatial Association (LISA) Analysis or Emerging Hot Spot Analysis (EHSA) to undercover the spatial and/or spatio-temporal mobility patterns of public bus passengers in Singapore.
The detailed tasks are:
Geovisualisation and Analysis:
Compute passenger trips generated by origin at the hexagon level for:
Peak Hour Period Bus Tap On Time Weekday Morning Peak 6am to 9am Weekday Afternoon Peak 5pm to 8pm Weekend/Holiday Morning Peak 11am to 2pm Weekend/Holiday Evening Peak 4pm to 7pm Display the geographical distribution of the passenger trips using appropriate geovisualisation method.
Describe the spatial patterns revealed by the geovisualisation.
EITHER:
Local Indicators of Spatial Association (LISA) Analysis:
Compute LISA of the passengers trips generated by origin at the hexagon level.
Display the LISA maps of the passengers trips generated by origin at the hexagon level - for those that are statistically significant.
Draw statistical conclusions from the analysis results.
OR:
Emerging Hot Spot Analysis (EHSA) with reference to the passenger trips generated by origin at the hexagon level for the four time intervals explored in 1. above:
Perform Mann-Kendall Test using the spatio-temporal local Gi* values.
Prepare EHSA maps of the Gi* values of the passenger trips generated by origin at the hexagon level - for those that are statistically significant.
Describe the spatial patterns revealed with reference to the EHSA maps and data visualisation prepared.
Either 2. or 3. is required to be completed. I have attempted to complete both to obtain a more comprehensive understanding of the patterns in the data.
2 Getting Started
2.1 Setting the Analytical Tools
The R packages used in this take-home exercise are:
tidyverse (i.e. readr, tidyr, dplyr) for performing data science tasks such as importing, tidying, and wrangling data;
sf for importing, managing, and processing geospatial data;
sfdep for analysing spatial dependence and spatial relationships in data (building on spdep);
tmap for thematic mapping;
mapview for interactive viewing of spatial objects;
plotly for making interactive plots; and
knitr for embedding R code in different document formats (e.g., HTML) to facilitate dynamic report generation.
They are loaded into the R environment using the following code:
2.2 Data Sources
The Land Transport Authority (LTA) studies commuter movements using the data collected from the use of smart cards and the Global Positioning System (GPS) devices on public buses. The LTA DataMall shares some of these data publicly, which helps to speed up the development of practical solutions to enhance reliability and efficiency of the transport system by other stakeholders, such as the private sector and individuals.
The data sets used in this take-home exercise are:
Passenger Volume by Origin Destination Bus Stops from the LTA DataMall for the period of August 2023 to October 2023. There are three aspatial data sets (one for each month) in csv format. It contains data on the number of trips by weekdays and weekends from origin to destination bus stops.
Bus Stop Location from the LTA DataMall. This is a geospatial data set in ESRI shapefile format. It contains a point representation that indicates the position of each bus stop where buses stop to pick up or drop off passengers.
The data sets are placed under two sub-folders:
geospatial (Bus Stop Location), and
aspatial (Passenger Volume by Origin Destination Bus Stops).
These two sub-folders are within the data folder of my In-class_Ex1 folder.
3 Data Wrangling
3.1 Importing and Merging the Aspatial Data - Passenger Volume
The three csv files are imported and combined into a tibble data frame, odbus, using functions from the base package as shown in the following codes:
- Generate list of file names of csv files using the
list.files()function in the base package.
Show the code
filenames = list.files(path="data/aspatial/", pattern="*.csv")
filenames[1] "origin_destination_bus_202308.csv" "origin_destination_bus_202309.csv"
[3] "origin_destination_bus_202310.csv"
- Generate path to csv files using the
file.path()function in the base package.
Show the code
path = file.path("data/aspatial", filenames)
path[1] "data/aspatial/origin_destination_bus_202308.csv"
[2] "data/aspatial/origin_destination_bus_202309.csv"
[3] "data/aspatial/origin_destination_bus_202310.csv"
- Merge the three csv files using the
do.call()andlapply()functions in the base package and theread.csv()function from the readr package.
Show the code
odbus = do.call("rbind", lapply(path, FUN=function(files){ read.csv(files)}))- Check using the
unique()function in the base package to confirm that the data sets for each of the three months have been incorporated. [Note: Although the three data sets have been merged into a single data frame, the observations would be analysed month by month, instead of aggregating all observations.]
Show the code
unique(odbus$YEAR_MONTH)[1] "2023-08" "2023-09" "2023-10"
Once the combined data set has been obtained, the n_distinct() function in the dplyr package and the sum() function in the base package are applied to uncover the following observations regarding odbus:
Overall, there are 17,118,005 rows (observations) and 7 columns. 5,709,512, 5,714,196, and 5,694,297 rows (observations) from August, September, and October 2023 respectively, reflecting a generally stable total ridership over the three-month period.
The “YEAR_MONTH” column has three unique values, reflecting the observations from August, September, and October 2023.
The “DAY_TYPE” column has two unique values, reflecting observations from weekday or weekends/holiday.
The “TIME_PER_HOUR” has 24 unique values, reflecting that the observations are broken down into each of the 24 hours of a day.
The “PT_TYPE” column refers only has the value of “BUS” as the public transport type. Hence, it may be dropped.
The “ORIGIN_PT_CODE” and “DESTINATION_PT_CODE” columns have 5,075 and 5,079 unique values respectively, reflecting the number of bus stops with bus routes passing through them.
The “TOTAL_TRIPS” column contains values that reflect the number of passengers for each unique month, type of day, origin bus stop, destination bus stop. The minimum value is 1, i.e., there are no rows with zero values.
Show the code
sapply(odbus, function(x) n_distinct(x)) YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE
3 2 24 1
ORIGIN_PT_CODE DESTINATION_PT_CODE TOTAL_TRIPS
5075 5079 4854
Show the code
sum(odbus$YEAR_MONTH == "2023-08")[1] 5709512
Show the code
sum(odbus$YEAR_MONTH == "2023-09")[1] 5714196
Show the code
sum(odbus$YEAR_MONTH == "2023-10")[1] 5694297
The output is saved in rds format. The rds file is then imported into the R environment.
Show the code
write_rds(odbus, "data/rds/odbus.rds")
odbus = read_rds("data/rds/odbus.rds")3.2 Importing and Transforming the Geospatial Data - Bus Stop Location
The Bus Stop Location shapefile is imported using st_read() in the sf package. The output is a simple feature data frame, busstop, which is in the SVY21 projected coordinates systems. It has 5,161 features and 3 fields, which include the geometry points indicating the bus stop locations.
Show the code
busstop = st_read(dsn = "data/geospatial",
layer = "BusStop")Reading layer `BusStop' from data source
`C:\jmphosis\ISSS624\Take-home_Ex\Take-home_Ex1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
Show the code
glimpse(busstop)Rows: 5,161
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
The st_crs() function in the sf package is then used to check the coordinate system of the busstop simple feature data frame. The output shows that although it is projected in SVY21, the EPSG is indicated as 9001, which is incorrect given that it should be 3414 instead.
Show the code
st_crs(busstop)Coordinate Reference System:
User input: SVY21
wkt:
PROJCRS["SVY21",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ID["EPSG",6326]],
PRIMEM["Greenwich",0,
ANGLEUNIT["Degree",0.0174532925199433]]],
CONVERSION["unnamed",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
To correct the EPSG, the st_set_crs() function in the sf package is applied. A check to confirm that the projection transformation has been applied is then made using the st_crs() function again.
Show the code
busstop = st_set_crs(busstop, 3414)
st_crs(busstop)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
3.3 Checking for Duplicates and Missing Values
The data sets from the LTA DataMall are expected to be relatively clean. Nevertheless, due diligence checks for duplicates and missing values are still made to confirm the assumptions.
- The
duplicated()function in the base package is used to check for duplicates inodbus. There are no duplicates in the tibble data frame.
Show the code
odbus[duplicated(odbus), ][1] YEAR_MONTH DAY_TYPE TIME_PER_HOUR
[4] PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
[7] TOTAL_TRIPS
<0 rows> (or 0-length row.names)
- The
duplicated()function in the base package andst_geometry()function in the sf package are used to check for duplicates inbusstop. The output returned Bus Stop No. 96319 at row 3265. On closer inspection of the simple feature data frame, rows 3264 and 3265 are found to be duplicates. Hence, the row 3264 is removed using therow_number()function in the dplyr package. The same check is conducted again to confirm that there are no more duplicates.
Show the code
busstop[duplicated(st_geometry(busstop)), ]Simple feature collection with 1 feature and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 42187.23 ymin: 34995.78 xmax: 42187.23 ymax: 34995.78
Projected CRS: SVY21 / Singapore TM
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
3265 96319 NIL YUSEN LOGISTICS POINT (42187.23 34995.78)
Show the code
busstop = busstop %>%
filter(row_number() != 3264)
busstop[duplicated(st_geometry(busstop)), ]Simple feature collection with 0 features and 3 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
<0 rows> (or 0-length row.names)
- The
is.na()andsum()functions in the base package and thesummarise_all()function in the dplyr package are used to check for missing values inodbus. There are no missing values in the tibble data frame.
Show the code
odbus %>%
summarise_all(~ sum(is.na(.))) YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
1 0 0 0 0 0 0
TOTAL_TRIPS
1 0
- The
is.na()andsum()functions in the base package are used to check for missing values inbusstop. There are no missing values in the simple feature data frame.
Show the code
sum(is.na(st_geometry(busstop)))[1] 0
3.4 Aggregating and Selecting Columns
Given that the analysis focuses on passenger volume by origin bus stop, the group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) by “ORIGIN_PT_CODE”, “YEAR_MONTH”, “DAY_TYPE”, and “TIME_PER_HOUR” in odbus. At the same time, the columns “DESTINATION_PT_CODE” and “PT_TYPE” are dropped as they are not required. The number of rows (observations) are condensed from 17,118,005 into 571,696.
Show the code
odbus = odbus %>%
group_by(ORIGIN_PT_CODE, YEAR_MONTH, DAY_TYPE, TIME_PER_HOUR) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))The select() function in dplyr package is used to drop the “BUS_ROOF_N” column in busstop since it is not required.
Show the code
busstop = busstop %>%
select(-BUS_ROOF_N)3.5 Performing Relational Join
The simple feature data frame, busstop, and the tibble data frame, odbus, are combined using the inner_join() function in the dplyr package, by matching the “BUS_STOP_N” column in busstop with the “ORIGIN_PT_CODE” in odbus. However, as the “BUS_STOP_N” values are character values, they would need to be converted to integer values first before the join, so that values such as “01012” in busstop can be matched to values such as “1012” in odbus.
Upon joining the two data frames, the output is the simple feature data frame (since busstop was placed as the first argument in the inner_join() function), bus. The number of rows (observations) was reduced from 571,696 to 567,725, i.e., 3,841 rows are dropped. This is likely because some newer bus stops such as Bus Stop No. 3549 (Shenton Way Stn Exit 3) and Bus Stop No. 96461 (Bef Aviation Pk Rd) were not included in the Bus Stop Location shapefile that was last updated in July 2023. This means that the analysis in this take-home exercise is limited in this aspect.
busstop = busstop %>%
mutate(BUS_STOP_N = as.integer(BUS_STOP_N))
bus = inner_join(busstop, odbus,
by = c("BUS_STOP_N" = "ORIGIN_PT_CODE"))The output is saved in rds format. The rds file is then imported into the R environment.
Show the code
write_rds(bus, "data/rds/bus.rds")
bus = read_rds("data/rds/bus.rds")3.6 Creating Spatial Hexagon Grid
Spatial grids are commonly used in spatial analysis. Regularly shaped grids may comprise of equilateral triangles, squares, or hexagons. The hexagon is the most circular-shaped polygon that can tessellate to form an evenly spaced grid, providing a low perimeter-to-area ratio that reduces sampling bias due to edge effects of the grid shape.
A more circular polygon means that points near the border are closer to the centroid. Hexagons are often used when the analysis involves aspects of connectivity or movement paths. Locating neighbours is simpler using a hexagon grid because the contact edge or length is consistent on each side, resulting in equidistant centroids for each neighbour.
A spatial hexagon grid, hgrid, for the study area of Singapore is created using the following codes:
area_hexagon_grid = st_make_grid(bus, c(500, 500), what = "polygons", square = FALSE)
hgrid = st_sf(area_hexagon_grid) %>%
mutate(grid_id = 1:length(lengths(area_hexagon_grid)))4 Exploratory Data Analysis - Computing, Visualising, and Deriving Insights on Passenger Trips by Origin
The steps in 4.1 are repeated for the other three peak periods studied in this take-home exercise.
4.1 Weekday Morning Peak (6am to 9.59am)
4.1.1 Computing Passenger Trips by Origin
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frames, wdAMAug, wdAMSep, and wdAMOct for the passenger trips by origin during weekday morning peak periods in August, September, and October 2023 respectively.
wdAMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9)Show the code for Sep and Oct
wdAMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9)
wdAMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9)The st_join() function in the sf package and the drop.na() function in the tidyr package are used to match the bus stops points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdAMAug_ct = st_join(hgrid, wdAMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))Show the code for Sep and Oct
wdAMSep_ct = st_join(hgrid, wdAMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
wdAMOct_ct = st_join(hgrid, wdAMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))The output is then plotted using functions in the tmap package. To facilitate comparisons between different peak periods, the breaks are set at fixed intervals of 100,000 passengers.
brk = c(1, 100000, 200000, 300000, 400000, 500000, 600000)map_wdAMAug = tm_shape(wdAMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Aug 2023")Show the code for Sep and Oct
map_wdAMSep = tm_shape(wdAMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Sep 2023")
map_wdAMOct = tm_shape(wdAMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Oct 2023")4.1.2 Visualising Passenger Trips by Origin
The plots from the three months are then visualised in the interactive tmap mode below.
Aug 2023
tmap_mode("view")
map_wdAMAugSep 2023
map_wdAMSepOct 2023
map_wdAMOct4.1.3 Deriving Insights on Weekday Morning Peak
xxx
4.2 Weekday Afternoon Peak (5pm to 8.59pm)
4.2.1 Computing Passenger Trips by Origin
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frames, wdPMAug, wdPMSep, and wdPMOct for the passenger trips by origin during weekday afternoon peak periods in August, September, and October 2023 respectively.
Show the code
wdPMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20)
wdPMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20)
wdPMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20)The st_join() function in the sf package and the drop.na() function in the tidyr package are used to match the bus stops points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
Show the code
wdPMAug_ct = st_join(hgrid, wdPMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
wdPMSep_ct = st_join(hgrid, wdPMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
wdPMOct_ct = st_join(hgrid, wdPMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))The output is then plotted using functions in the tmap package.
Show the code
map_wdPMAug = tm_shape(wdPMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Aug 2023")
map_wdPMSep = tm_shape(wdPMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Sep 2023")
map_wdPMOct = tm_shape(wdPMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Oct 2023")4.2.2 Visualising Passenger Trips by Origin
The plots from the three months are then visualised in the interactive tmap mode below.
Aug 2023
map_wdPMAugSep 2023
map_wdPMSepOct 2023
map_wdPMOct4.2.3 Deriving Insights on Weekday Afternoon Peak
xxx
4.3 Weekend/Holiday Morning Peak (11am to 2.59pm)
4.3.1 Computing Passenger Trips by Origin
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frames, weAMAug, weAMSep, and weAMOct for the passenger trips by origin during weekend/holiday morning peak periods in August, September, and October 2023 respectively.
Show the code
weAMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14)
weAMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14)
weAMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14)The st_join() function in the sf package and the drop.na() function in the tidyr package are used to match the bus stops points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
Show the code
weAMAug_ct = st_join(hgrid, weAMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
weAMSep_ct = st_join(hgrid, weAMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
weAMOct_ct = st_join(hgrid, weAMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))The output is then plotted using functions in the tmap package.
Show the code
map_weAMAug = tm_shape(weAMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Aug 2023")
map_weAMSep = tm_shape(weAMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Sep 2023")
map_weAMOct = tm_shape(weAMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Oct 2023")4.3.2 Visualising Passenger Trips by Origin
The plots from the three months are then visualised in the interactive tmap mode below.
Aug 2023
map_weAMAugSep 2023
map_weAMSepOct 2023
map_weAMOct4.3.3 Deriving Insights on Weekend/Holiday Morning Peak
xxx
4.4 Weekend/Holiday Evening Peak (4pm to 7.59pm)
4.4.1 Computing Passenger Trips by Origin
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frames, wePMAug, wePMSep, and wePMOct for the passenger trips by origin during weekend/holiday evening peak periods in August, September, and October 2023 respectively.
Show the code
wePMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19)
wePMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19)
wePMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19)The st_join() function in the sf package and the drop.na() function in the tidyr package are used to match the bus stops points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
Show the code
wePMAug_ct = st_join(hgrid, wePMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
wePMSep_ct = st_join(hgrid, wePMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
wePMOct_ct = st_join(hgrid, wePMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))The output is then plotted using functions in the tmap package.
Show the code
map_wePMAug = tm_shape(wePMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Aug 2023")
map_wePMSep = tm_shape(wePMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Sep 2023")
map_wePMOct = tm_shape(wePMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Oct 2023")4.4.2 Visualising Passenger Trips by Origin
The plots from the three months are then visualised in the interactive tmap mode below.
Aug 2023
map_wePMAugSep 2023
map_wePMSepOct 2023
map_wePMOct4.4.3 Deriving Insights on Weekend/Holiday Evening Peak
xxx
4.5 Comparing Between Different Peak Periods
4.5.1 Weekday Morning versus Afternoon
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(map_wdAMAug, map_wdPMAug, ncol = 2)Sep 2023
tmap_arrange(map_wdAMSep, map_wdPMSep, ncol = 2)Oct 2023
tmap_arrange(map_wdAMOct, map_wdPMOct, ncol = 2)4.5.2 Weekend/Holiday Morning versus Evening
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(map_weAMAug, map_wePMAug, ncol = 2)Sep 2023
tmap_arrange(map_weAMSep, map_wePMSep, ncol = 2)Oct 2023
tmap_arrange(map_weAMOct, map_wePMOct, ncol = 2)4.5.3 Weekday versus Weekend/Holiday Morning
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(map_wdAMAug, map_weAMAug, ncol = 2)Sep 2023
tmap_arrange(map_wdAMSep, map_weAMSep, ncol = 2)Oct 2023
tmap_arrange(map_wdAMOct, map_weAMOct, ncol = 2)4.5.4 Weekday Afternoon versus Weekend/Holiday Evening
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(map_wdPMAug, map_wePMAug, ncol = 2)Sep 2023
tmap_arrange(map_wdPMSep, map_wePMSep, ncol = 2)Oct 2023
tmap_arrange(map_weAMOct, map_wePMOct, ncol = 2)4.5.4 Insights from Comparing Different Peak Periods
xxx
5 Local Indicators of Spatial Association (LISA) Analysis
Local Indicators of Spatial Association (LISA) are statistics that evaluate the existence of clusters in the spatial arrangement of a given variable. The appropriate LISA, including local Moran’s I, are applied to detect cluster(s) and/or outlier(s) for the Bus Passenger Trips in the bus simple feature data frame, month by month.
5.1 Computing Distance-based Weights
Distance-based weights are used in this take-home exercise. Distance-based weights are more appropriate than contiguity spatial weights in this context given that the hexagons are spread out across Singapore, and some hexagons may not have immediate neighbours (i.e., contiguous) due to reasons such as being separated by reservoirs or waterways.
There are three popularly used distance-based spatial weights, they are: fixed distance weights, adaptive distance weights, and inverse distance weights (IDW). For this take-home exercise, a combination of IDW and adaptive distance weights are used. IDW assigns higher weights to neighbours that are closer and lower weights to neighbours that are further away, while adaptive distance weights balances out the number of neighbours for each hexagon regardless of how densely populated its surrounding areas is (i.e., density of nearby bus stops).
The
st_knn()function in the sfdep package is used to identify neighbors based on k (i.e. k = 8 indicates the nearest eight neighbours). The output is a list of neighbours (i.e.nb).The
st_inverse_distance()function in the sfdep package is then used to calculate inverse distance weights of neighbours in thenb.
For Weekday Morning Peak:
wdAMAug_wt = wdAMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)Show the code for Sep and Oct
wdAMSep_wt = wdAMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdAMOct_wt = wdAMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)For Weekday Afternoon Peak:
Show the code
wdPMAug_wt = wdPMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdPMSep_wt = wdPMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdPMOct_wt = wdPMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)For Weekend/Holiday Morning Peak:
Show the code
weAMAug_wt = weAMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
weAMSep_wt = weAMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
weAMOct_wt = weAMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)For Weekend/Holiday Evening Peak:
Show the code
wePMAug_wt = wePMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wePMSep_wt = wePMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wePMOct_wt = wePMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)5.2 Computing Local Indicators of Spatial Association (LISA) of Passenger Trips by Origin
The local_moran() function in the sfdep package is used to compute the local Moran’s I value of “TOTAL_TRIPS” at the hexagon level. The output is a tibble data frame, lisa.
- The output is a simple feature data frame containing the columns ii, eii, var_ii, z_ii, p_ii, p_ii_sim, and p_folded_sim.
ii: Local moran statistic.
eii: Expectation of local moran statistic; for local_moran_perm(), the permutation sample means.
var_ii: Variance of local moran statistic; for local_moran_perm(), the permutation sample standard deviations.
z_ii: Standard deviation of local moran statistic; for local_moran_perm(), based on permutation of sample means and standard deviations
p_ii: p-value of local moran statistic using pnorm(); for local_moran_perm(), using standard deviation based on permutation of sample means and standard deviations
p_ii_sim: For local_moran_perm(), rank() and punif() of observed statistic rank for [0, 1] p-values using “alternative=”
p_folded_sim: The simulation folded [0, 0.5] range ranked p-value.
skewness: For local_moran_perm(), the output of e1071::skewness() for the permutation samples underlying the standard deviates
kurtosis: For local_moran_perm(), the output of e1071::kurtosis() for the permutation samples underlying the standard deviates.
- The
unnest()function in the tidyr package is used to expand a list-column containing data frames into rows and columns. - In terms of interpretation:
- A positive local Moran’s I value indicates a hexagon close to similar values (High-High or Low-Low);
- A negative value indicates a hexagon close to dissimilar values (High-Low or Low-High); and
- A value near zero suggests no significant local spatial autocorrelation.
For Weekday Morning Peak:
wdAMAug_lisa = wdAMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)Show the code for Sep and Oct
wdAMSep_lisa = wdAMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdAMOct_lisa = wdAMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)For Weekday Afternoon Peak:
Show the code
wdPMAug_lisa = wdPMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdPMSep_lisa = wdPMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdPMOct_lisa = wdPMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)For Weekend/Holiday Morning Peak:
Show the code
weAMAug_lisa = weAMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
weAMSep_lisa = weAMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
weAMOct_lisa = weAMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)For Weekend/Holiday Evening Peak:
Show the code
wePMAug_lisa = wePMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wePMSep_lisa = wePMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wePMOct_lisa = wePMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)5.3 Creating Local Indicators of Spatial Association (LISA) Maps of Passenger Trips by Origin
The LISA map is a categorical map showing outliers and clusters. There are two types of outliers namely: High-Low and Low-High outliers. Likewise, there are two type of clusters namely: High-High and Low-Low clusters. The LISA map is an interpreted map combining local Moran’s I of geographical areas and their respective p-values.
For Weekday Morning Peak:
wdAMAug_lisasig = wdAMAug_lisa %>%
filter(p_ii_sim < 0.05)
wdAMAug_lisamap = tm_shape(wdAMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Aug 2023")Show the code for Sep and Oct
wdAMSep_lisasig = wdAMSep_lisa %>%
filter(p_ii_sim < 0.05)
wdAMSep_lisamap = tm_shape(wdAMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Sep 2023")
wdAMOct_lisasig = wdAMOct_lisa %>%
filter(p_ii_sim < 0.05)
wdAMOct_lisamap = tm_shape(wdAMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Oct 2023")For Weekday Afternoon Peak:
Show the code
wdPMAug_lisasig = wdPMAug_lisa %>%
filter(p_ii_sim < 0.05)
wdPMAug_lisamap = tm_shape(wdPMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Aug 2023")
wdPMSep_lisasig = wdPMSep_lisa %>%
filter(p_ii_sim < 0.05)
wdPMSep_lisamap = tm_shape(wdPMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Sep 2023")
wdPMOct_lisasig = wdPMOct_lisa %>%
filter(p_ii_sim < 0.05)
wdPMOct_lisamap = tm_shape(wdPMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Oct 2023")For Weekend/Holiday Morning Peak:
Show the code
weAMAug_lisasig = weAMAug_lisa %>%
filter(p_ii_sim < 0.05)
weAMAug_lisamap = tm_shape(weAMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Aug 2023")
weAMSep_lisasig = weAMSep_lisa %>%
filter(p_ii_sim < 0.05)
weAMSep_lisamap = tm_shape(weAMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Sep 2023")
weAMOct_lisasig = weAMOct_lisa %>%
filter(p_ii_sim < 0.05)
weAMOct_lisamap = tm_shape(weAMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Oct 2023")For Weekend/Holiday Evening Peak:
Show the code
wePMAug_lisasig = wePMAug_lisa %>%
filter(p_ii_sim < 0.05)
wePMAug_lisamap = tm_shape(wePMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Aug 2023")
wePMSep_lisasig = wePMSep_lisa %>%
filter(p_ii_sim < 0.05)
wePMSep_lisamap = tm_shape(wePMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Sep 2023")
wePMOct_lisasig = wePMOct_lisa %>%
filter(p_ii_sim < 0.05)
wePMOct_lisamap = tm_shape(wePMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Oct 2023")5.4 Comparing Between Different Peak Periods
5.4.1 Weekday Morning versus Afternoon
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(wdAMAug_lisamap, wdPMAug_lisamap, ncol = 2)Sep 2023
tmap_arrange(wdAMSep_lisamap, wdPMSep_lisamap, ncol = 2)Oct 2023
tmap_arrange(wdAMOct_lisamap, wdPMOct_lisamap, ncol = 2)5.4.2 Weekend/Holiday Morning versus Evening
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(weAMAug_lisamap, wePMAug_lisamap, ncol = 2)Sep 2023
tmap_arrange(weAMSep_lisamap, wePMSep_lisamap, ncol = 2)Oct 2023
tmap_arrange(weAMOct_lisamap, wePMOct_lisamap, ncol = 2)5.4.3 Weekday versus Weekend/Holiday Morning
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(wdAMAug_lisamap, weAMAug_lisamap, ncol = 2)Sep 2023
tmap_arrange(wdAMSep_lisamap, weAMSep_lisamap, ncol = 2)Oct 2023
tmap_arrange(wdAMOct_lisamap, weAMOct_lisamap, ncol = 2)5.4.4 Weekday Afternoon versus Weekend/Holiday Evening
The plots from the three months are visualised in the interactive tmap mode below.
Aug 2023
tmap_arrange(wdPMAug_lisamap, wePMAug_lisamap, ncol = 2)Sep 2023
tmap_arrange(wdPMSep_lisamap, wePMSep_lisamap, ncol = 2)Oct 2023
tmap_arrange(wdPMOct_lisamap, wePMOct_lisamap, ncol = 2)5.4.5 Insights from Comparing Different Peak Periods
xxx